Topic modeling for untargeted substructure exploration in metabolomics.
Abstract
The potential of untargeted metabolomics to answer important questions across the life sciences is hindered by a paucity of computational tools that enable extraction of key biochemically relevant information. Available tools focus on using mass spectrometry fragmentation spectra to identify molecules whose behavior suggests they are relevant to the system under study. Unfortunately, fragmentation spectra cannot identify molecules in isolation; identification requires authentic standards or databases of known fragmented molecules. Fragmentation spectra are, however, replete with information pertaining to the biochemical processes present, much of which is currently neglected. Here, we present an analytical workflow that exploits all fragmentation data from a given experiment to extract biochemically relevant features in an unsupervised manner. We demonstrate that an algorithm originally used for text mining, latent Dirichlet allocation, can be adapted to handle metabolomics datasets. Our approach extracts biochemically relevant molecular substructures ("Mass2Motifs") from spectra as sets of co-occurring molecular fragments and neutral losses. The analysis isolates these substructures, so that molecules can be grouped by the substructures they share regardless of classical spectral similarity. These substructures, in turn, support putative de novo structural annotation of molecules. Combining this spectral connectivity with orthogonal correlations (e.g., common abundance changes under system perturbation) significantly enhances our ability to provide mechanistic explanations for biological behavior.
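The idea of treating spectra as text can be illustrated with a small sketch. Each fragmentation spectrum becomes a "document" whose "words" are binned fragment m/z values and neutral losses (precursor m/z minus fragment m/z), and latent Dirichlet allocation then looks for sets of words that tend to co-occur. The abstract does not say how the model is fitted, so the collapsed Gibbs sampler below is an assumption chosen for brevity. The function names (`spectrum_to_words`, `fit_lda`), the two-decimal binning, the intensity scaling, and all m/z values are hypothetical and serve only as an illustration.

```python
import random
from collections import defaultdict


def spectrum_to_words(precursor_mz, peaks, intensity_unit=10.0):
    """Turn one spectrum into a bag of fragment and neutral-loss words.

    peaks is a list of (fragment_mz, intensity) pairs. Binning to two
    decimals and repeating words by intensity are illustrative assumptions.
    """
    words = []
    for mz, intensity in peaks:
        count = max(1, round(intensity / intensity_unit))
        words += [f"fragment_{mz:.2f}"] * count
        loss = precursor_mz - mz
        if loss > 0:
            words += [f"loss_{loss:.2f}"] * count
    return words


def fit_lda(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with a collapsed Gibbs sampler (an assumed inference scheme).

    Returns theta (per-document topic proportions) and phi (per-topic word
    probabilities); each topic plays the role of a candidate motif.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    n_vocab = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [defaultdict(int) for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []
    # Random initial topic assignment for every word occurrence.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment, then resample it.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + n_vocab * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]
    phi = [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + n_vocab * beta)
         for w in vocab}
        for k in range(n_topics)
    ]
    return theta, phi


# Toy spectra with made-up values: name -> (precursor m/z, peaks).
spectra = {
    "mol_A": (300.00, [(120.08, 50), (138.09, 30)]),
    "mol_B": (350.00, [(120.08, 40), (138.09, 40)]),
    "mol_C": (400.00, [(238.00, 60), (85.03, 20)]),
    "mol_D": (420.00, [(258.00, 50), (85.03, 30)]),
}
names = list(spectra)
docs = [spectrum_to_words(*spectra[name]) for name in names]
theta, phi = fit_lda(docs, n_topics=2)

for k, word_probs in enumerate(phi):
    top_words = sorted(word_probs, key=word_probs.get, reverse=True)[:3]
    members = [n for n, t in zip(names, theta) if t[k] > 0.5]
    print(f"motif {k}: top words {top_words}, molecules {members}")
```

In this toy data, mol_C and mol_D share no fragment with a dominant intensity pattern other than 85.03, but both lose 162.00 from the precursor, so a shared neutral-loss word links them even though their main fragment peaks differ. Which molecules end up under which motif depends on the random seed and the hyperparameters, so the printed grouping should be read as an illustration rather than a fixed result.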
Related resources
A Conversation on Data Mining Strategies in LC-MS Untargeted Metabolomics: Pre-Processing and Pre-Treatment Steps
Untargeted metabolomic studies generate information-rich, high-dimensional, and complex datasets that remain challenging to handle and fully exploit. Despite the remarkable progress in the development of tools and algorithms, the "exhaustive" extraction of information from these metabolomic datasets is still a non-trivial undertaking. A conversation on data mining strategies for a maximal infor...
Untargeted metabolomics suffers from incomplete data analysis
Introduction: Untargeted metabolomics is a powerful tool for biological discoveries. Significant advances in computational approaches to analyzing the complex raw data have been made, yet it is not clear how exhaustive and reliable the data analysis results are. Objectives: Assessment of the quality of data analysis results in untargeted metabolomics. Methods: Five published untargeted metabolo...
Amino Acid Metabolism is Altered in Adolescents with Nonalcoholic Fatty Liver Disease-An Untargeted, High Resolution Metabolomics Study.
OBJECTIVE To conduct an untargeted, high resolution exploration of metabolic pathways that were altered in association with hepatic steatosis in adolescents. STUDY DESIGN This prospective, case-control study included 39 Hispanic-American, obese adolescents aged 11-17 years evaluated for hepatic steatosis using magnetic resonance spectroscopy. Of these 39 individuals, 30 had hepatic steatosis ≥...
Structured plant metabolomics for the simultaneous exploration of multiple factors
Multiple factors act simultaneously on plants to establish complex interaction networks involving nutrients, elicitors and metabolites. Metabolomics offers a better understanding of complex biological systems, but evaluating the simultaneous impact of different parameters on metabolic pathways that have many components is a challenging task. We therefore developed a novel approach that combines...
Automated LC-HRMS(/MS) Approach for the Annotation of Fragment Ions Derived from Stable Isotope Labeling-Assisted Untargeted Metabolomics
Structure elucidation of biological compounds is still a major bottleneck of untargeted LC-HRMS approaches in metabolomics research. The aim of the present study was to combine stable isotope labeling and tandem mass spectrometry for the automated interpretation of the elemental composition of fragment ions and thereby facilitate the structural characterization of metabolites. The software tool...
Journal:
- Proceedings of the National Academy of Sciences of the United States of America
Volume 113, Issue 48
Pages: -
Publication date: 2016